Abstract

We study the recovery of parity check relations for an unknown code from a bitstream intercepted at the output of a Binary Symmetric Channel. An iterative column elimination algorithm is introduced which attempts to eliminate parity bits in codewords of noisy data. The algorithm is practical because of its low complexity and because it uses only the XOR operator. Since the computational complexity is low, searching for the code length and synchronization is possible. Furthermore, the Hamming weights of the parity check words are used only in the threshold computation and, unlike in other algorithms, have a negligible effect on the proposed algorithm. Finally, experimental results are presented and the maximum noise level that still allows the words of the parity check matrix to be recovered is estimated.

I. Introduction

Error correction codes are used in telecommunication systems to deal with noise and to improve data transmission capability. The message is encoded at the transmitter by a channel encoder and decoded at the receiver, which knows the parameters of the code, such as its generator and parity check matrices. Recovering the specification of a communication system from the received signal, without any knowledge of the transmitter, is very complicated. The solution in [1] is to find the (n, k)-code nearest (in the sense of Hamming distance) to the code used for channel coding. The associated decision problem is proved to be NP-complete. [2] introduces an iterative-decoding-based algorithm using the Gallager decoder, which searches for low-weight words. [2] and [3] address the problem as reverse-engineering a communication system.

In [4] and [5] an algorithm based on rank computation was introduced to find the correct length and synchronization, but a simple method is needed to find the ranks of the sub-matrices. Two algorithms were proposed in [6] to search for words of the parity check matrix, from which the code length and synchronization can be obtained. If the code length and synchronization are chosen correctly, looking for dual codewords is sufficient to obtain the code length, the synchronization, and words of the dual code [6]. To reduce the search space, the Canteaut-Chabaud information set decoding algorithm [7] was used. However, a proper choice of the weight of the parity check words is needed in order to recover them.

In this paper we present a very low complexity algorithm to find the words of a parity check matrix without any assumption about the weight of the words. The main idea of this work is to use the parity check equations which lead to zero syndrome bits. In fact, to recover any word h of the parity check matrix, we only need k independent codewords $v_j$, $1 \le j \le k$, that satisfy the parity equation $v_j h^T = 0$.

The paper is organized as follows. In Section II the column elimination operation and linear block codes are discussed. Parity elimination on linear block codes is presented in Section III. In Section IV the iterative elimination algorithm is introduced for recovering dual codewords from noisy data. The equations required for the threshold computation are derived in Section V. Experimental results are given in Section VI, and Section VII concludes the paper.



II. Preliminaries 



A. Column Elimination Operation 
Consider the matrix $A = (a_{ij})$, where $a_{ij} \in \{0, 1\}$ for $1 \le i \le n$, $1 \le j \le M$. The linearly independent columns of A form a basis for a vector space S, and every column of A is an element of S. Elementary column operations can be used to obtain the column echelon form in order to find a basis set. They are performed on A as follows:

Step 1: If $a_{11} \neq 0$, the first column of A is taken as a basis vector of the vector space S and is eliminated from the other, dependent vectors by multiplication by the transition matrix $\Gamma^{(1)}$. Therefore $A^{(1)}$, the matrix after the first step, is
$A^{(1)} = A \times \Gamma^{(1)}$.
If $a_{11} = 0$, the first row should be replaced by a row with a non-zero first element; then the primary column operation can be performed.

Step 2: If $a_{22} \neq 0$, multiplication by the transition matrix $\Gamma^{(2)}$ results in
$A^{(2)} = A^{(1)} \times \Gamma^{(2)}$.
Similarly, if $a_{22} = 0$, the second row should be replaced by a row with a non-zero second element; then the primary column operation can be performed.

Step j: If $a_{jj} \neq 0$, multiplication by the transition matrix $\Gamma^{(j)}$ results in
$A^{(j)} = A^{(j-1)} \times \Gamma^{(j)}$.

We call this process the column elimination operation and continue it until echelon form is achieved. The number of steps equals the rank of A. If an all-zero column appears at any step, we shift this column to the right-hand side of the matrix and apply the same change to the transition matrix.
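
The column elimination operation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `column_eliminate` and `matmul_gf2` are hypothetical, and when a pivot is missing the sketch swaps columns (recording the swap in the transition matrix) instead of replacing rows as in the text, so that the result satisfies E = A x Pi (mod 2) exactly.

```python
def column_eliminate(A):
    """Column elimination over GF(2).

    Returns (E, Pi, rank) where E = A x Pi (mod 2) is in column echelon
    form with its all-zero columns on the right-hand side.
    """
    M, n = len(A), len(A[0])
    E = [row[:] for row in A]                                  # working copy of A
    Pi = [[int(i == j) for j in range(n)] for i in range(n)]    # starts as identity
    rank = 0
    for r in range(M):
        # Look for a pivot in row r among the columns not yet used as basis vectors.
        c = next((c for c in range(rank, n) if E[r][c]), None)
        if c is None:
            continue
        # Move the pivot column to position `rank`, mirroring the swap on Pi.
        for mat in (E, Pi):
            for row in mat:
                row[rank], row[c] = row[c], row[rank]
        # XOR the pivot column into every other column that has a 1 in row r.
        for j in range(n):
            if j != rank and E[r][j]:
                for mat in (E, Pi):
                    for row in mat:
                        row[j] ^= row[rank]
        rank += 1
        if rank == n:
            break
    return E, Pi, rank


def matmul_gf2(X, Y):
    """Matrix product over GF(2)."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*Y)] for row in X]


A = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 1]]          # third row = first row + second row, so the rank is 2
E, Pi, rank = column_eliminate(A)
```

In this small example the two right-hand columns of `E` are all-zero, and the corresponding columns of `Pi` are orthogonal to every row of `A`.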

B. Linear Block Codes

Suppose C denotes a binary linear block code $(n, k, d_{min})$. Then any codeword $c \in C$ is a linear combination of the rows of the generator matrix of C. In fact, any codeword can be represented as a linear combination of the elements of a basis of a k-dimensional space.

The systematic generator matrix $G_{sys}$ of a linear block code $(n, k, d_{min})$ can be represented by two sub-matrices: the $k \times k$ identity matrix $I_k$ and a $k \times (n-k)$ parity sub-matrix P, such that

$G_{sys} = (I_k \mid P)$, (5)

$H_{sys} = (-P^T \mid I_{n-k})$. (6)

$H_{sys}$, the systematic form of H, is a basis of the $(n-k)$-dimensional dual code of C; in other words, H is a full rank matrix [8]. The general form of the generator matrix can be represented by two sub-matrices: the $k \times k$ matrix of basis vectors, B, and a $k \times (n-k)$ parity sub-matrix P, such that

$G = (B \mid P)$, (7)
$G H^T = 0$. (8)

The basis sub-matrix B can be written as $B = (B_1 \cdots B_k)$, where $B_j$, $1 \le j \le k$, is the jth column of B, and the parity sub-matrix P can be written as $P = (P_1 \cdots P_{n-k})$, where $P_j$, $1 \le j \le n-k$, is the jth column of P. Note that

$P_j = (p_{1j} \cdots p_{kj})^T$. (9)

It is clear that the $P_j$'s can be rewritten as linear combinations of the $B_j$'s. In fact, this relation between the $P_j$'s and the $B_j$'s is the same as the relation defined by the parity check equations of the code, i.e.

$P_j = \sum_{m=1}^{k} p_{m,j} B_m, \quad 1 \le j \le n-k$. (10)
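
As a small numerical check of equations (5), (6) and (8), the following sketch builds a systematic pair from a parity sub-matrix and verifies that $G H^T = 0$ over GF(2), where $-P^T = P^T$. The function names and the particular matrix `P` (one common choice for a (7,4) Hamming code) are assumptions made for illustration only.

```python
def systematic_pair(P):
    """Build G = (I_k | P) and H = (P^T | I_{n-k}) over GF(2) from a k x (n-k) matrix P."""
    k, r = len(P), len(P[0])
    G = [[int(i == j) for j in range(k)] + list(P[i]) for i in range(k)]
    H = [[P[i][j] for i in range(k)] + [int(j == t) for t in range(r)] for j in range(r)]
    return G, H


def times_transpose_gf2(G, H):
    """Compute G x H^T over GF(2)."""
    return [[sum(g * h for g, h in zip(grow, hrow)) % 2 for hrow in H] for grow in G]


# Assumed parity sub-matrix of a (7,4) Hamming code.
P = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1],
     [1, 1, 1]]
G, H = systematic_pair(P)
S = times_transpose_gf2(G, H)   # all-zero k x (n-k) matrix
```

With this choice of `P`, the third row of `H` is (0111001).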

III. Parity Elimination on Linear Block Codes

Since the rank of G is k, performing the column elimination operation on G results in $n-k$ all-zero columns in $G^{(k)}$. As mentioned earlier, the column elimination operation is carried out in such a way that these $n-k$ all-zero columns appear on the right-hand side of $G^{(k)}$. Therefore $G^{(k)}$ can be written as

$G^{(k)} = (B_{k \times k} \mid 0)$, (11)

where $B_{k \times k}$ is a basis set for the k-dimensional vector space and 0 is a $k \times (n-k)$ all-zero matrix.

Theorem 1: If the column elimination operation is performed on the generator matrix G of a binary linear block code C, a basis set for the $(n-k)$-dimensional vector space will appear in the columns of the transition matrix $\Pi = \Gamma^{(1)} \times \cdots \times \Gamma^{(k)}$ that correspond to the all-zero columns of $G^{(k)}$.
Proof: The procedure of the column elimination operation is

$G^{(k)} = G \times \Gamma^{(1)} \times \cdots \times \Gamma^{(k)} = G \times \Pi = (B_{k \times k} \mid 0_{k \times (n-k)})$. (12)

On the other hand, any $n-k$ linearly independent vectors, $H^T \neq 0$, that satisfy $G H^T = 0$ can form a basis set for the dual code space. As the elimination steps show, an identity matrix $I_{n-k}$ appears in $G^{(k)}$, which means that the $n-k$ vectors on the right-hand side of the matrix $\Pi$ are linearly independent. According to eq. (12), the $n-k$ right-hand columns of $\Pi$, multiplied by G, result in an all-zero sub-matrix in $G^{(k)}$, so they generate a basis for the dual code space. ■
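
Theorem 1 can be checked numerically on a small code. The sketch below is an illustration under assumptions, not the authors' implementation: it uses a hypothetical helper `column_eliminate` that pivots by column swaps, and an assumed systematic generator matrix of a (7,4) Hamming code.

```python
def column_eliminate(A):
    """Column elimination over GF(2); returns (E, Pi, rank) with E = A x Pi (mod 2)."""
    n = len(A[0])
    E = [row[:] for row in A]
    Pi = [[int(i == j) for j in range(n)] for i in range(n)]
    rank = 0
    for r in range(len(A)):
        c = next((c for c in range(rank, n) if E[r][c]), None)
        if c is None:
            continue
        for mat in (E, Pi):                      # bring the pivot column to position `rank`
            for row in mat:
                row[rank], row[c] = row[c], row[rank]
        for j in range(n):                       # clear row r in all other columns
            if j != rank and E[r][j]:
                for mat in (E, Pi):
                    for row in mat:
                        row[j] ^= row[rank]
        rank += 1
    return E, Pi, rank


# Assumed generator matrix G = (I_4 | P) of a (7,4) Hamming code.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
E, Pi, rank = column_eliminate(G)

n = len(G[0])
# The n - k right-hand columns of Pi, read as row vectors.
dual_words = [[Pi[m][j] for m in range(n)] for j in range(rank, n)]
# Every recovered word must be orthogonal to every row of G.
checks = [sum(g * h for g, h in zip(row, word)) % 2 for row in G for word in dual_words]
```

For this generator matrix the three right-hand columns of `Pi` coincide with the rows of the systematic parity check matrix $(P^T \mid I_3)$.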
Example 1: Let G and H be the generator and parity check matrices of code C, the (7,4) Hamming code. We perform the column elimination operation on a matrix A whose rows are codewords of C.

Step 1.1: $A^{(1)} = A \times \Gamma^{(1)}$

Step 1.2: $A^{(2)} = A^{(1)} \times \Gamma^{(2)}$

Step 1.3: $A^{(3)} = A^{(2)} \times \Gamma^{(3)}$

The transition matrix of the operation is

$\Pi = \Gamma^{(1)} \times \Gamma^{(2)} \times \cdots$

Note that, as shown in equation (19), the three right-hand columns of the transition matrix $\Pi$ form a basis set for the 3-dimensional dual code space.

Example 2: Consider code C of Example 1 and a noisy matrix A in which the noisy elements are denoted by boldface bits. The noisy elements are chosen arbitrarily so as to satisfy $A \times h^T = 0$, $h = (0111001)$. We perform the column elimination operation as follows:

Step 2.1: $A^{(1)} = A \times \Gamma^{(1)}$

Step 2.2: $A^{(2)} = A^{(1)} \times \Gamma^{(2)}$

Step 2.3: $A^{(3)} = A^{(2)} \times \Gamma^{(3)}$

Step 2.4: $A^{(4)} = A^{(3)} \times \Gamma^{(4)}$

The transition matrix of the operation on the noisy data is

$\Pi = \Gamma^{(1)} \times \Gamma^{(2)} \times \Gamma^{(3)} \times \Gamma^{(4)}$

The right-hand column of $\Pi$ is a dual codeword (namely h). The reason is that the parity check equation of the vector $h = (0111001)$ states that the sum of the second, third, fourth and seventh columns of any matrix consisting of codewords (e.g. G) equals zero, whereas the errors occur in the first and fifth columns, which are not involved in the parity check equation of this vector.

Note that there will certainly be three all-zero columns after the column elimination operation, because the rank of the matrix is limited by the number of rows (and not columns). Therefore, not every column of the transition matrix that corresponds to an all-zero column can be deemed a dual codeword. To find out whether such columns of the transition matrix are dual codewords or not, more codewords are required. This is discussed in Section IV.

Although the error rate is very high, around 0.07, one of the columns of $H^T$ was recovered. This suggests that dual codewords can be recovered at relatively large error rates.

Result 1: In order to obtain any basis vector of the dual code space (any column of the matrix $H^T$), only k independent (valid or invalid) codewords $v_j$, $1 \le j \le k$, are needed such that they satisfy $v_j h^T = 0$, $1 \le j \le k$.
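
The condition in Result 1 can be illustrated with a short check: bit errors that fall outside the support of h leave $v h^T = 0$ intact, while an error inside the support breaks it. The word `v` below is an assumed example that satisfies the parity equation of $h = (0111001)$; the helper names are hypothetical.

```python
def syndrome_bit(v, h):
    """Compute v x h^T over GF(2)."""
    return sum(a & b for a, b in zip(v, h)) % 2


def flip(v, pos):
    """Return a copy of v with the bit at index pos (0-based) inverted."""
    w = list(v)
    w[pos] ^= 1
    return w


h = [0, 1, 1, 1, 0, 0, 1]            # involves columns 2, 3, 4 and 7 (1-based)
v = [1, 0, 0, 0, 1, 1, 0]            # assumed word with v x h^T = 0

outside = flip(flip(v, 0), 4)        # errors in the first and fifth columns
inside = flip(v, 1)                  # error in the second column

s_clean = syndrome_bit(v, h)
s_outside = syndrome_bit(outside, h)
s_inside = syndrome_bit(inside, h)
```

Here `s_clean` and `s_outside` are 0, so the corrupted word `outside` still counts towards the k words required by Result 1, whereas `s_inside` is 1.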

IV. Recovering Dual Codewords from Noisy Data

Suppose that the transmission channel is a binary symmetric channel (BSC) with crossover probability $\epsilon \ll \frac{1}{2}$. Suppose also that the message is encoded using a linear block code C(n, k). The bitstream intercepted from the channel is first divided into words of length n, and the matrix A is constructed such that each word is placed in one row. Note that the number of rows of A should be chosen in such a way that the rank of A is not constrained by the number of rows (the number of codewords required for the recovery operation is discussed in Section V). We now introduce the iterative elimination algorithm to recover the parity check matrix. For $\epsilon \ll \frac{1}{2}$, the parity check matrix can be obtained in a small number of iterations.
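
The construction of the data matrix A from an intercepted stream can be sketched as follows. The function names, the `offset` parameter (a candidate synchronization position) and the random stand-in stream are assumptions made for illustration; they are not taken from the paper.

```python
import random


def bsc(bits, eps, rng):
    """Flip each bit independently with probability eps (binary symmetric channel)."""
    return [b ^ int(rng.random() < eps) for b in bits]


def to_data_matrix(bits, n, offset=0):
    """Cut the stream into consecutive words of length n, one word per row.

    `offset` is the assumed synchronization position; an incomplete
    trailing word is dropped.
    """
    usable = bits[offset:]
    M = len(usable) // n
    return [usable[i * n:(i + 1) * n] for i in range(M)]


rng = random.Random(0)
stream = [rng.randint(0, 1) for _ in range(7 * 40 + 3)]   # stand-in for an intercepted stream
received = bsc(stream, 0.01, rng)
A = to_data_matrix(received, n=7)                          # 40 rows of length 7
```

When the code length and synchronization are unknown, the same routine can be called for each candidate pair (`n`, `offset`) before running the elimination.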
Iterative Elimination Algorithm

Iteration 1: Consider a $k \times n$ window $W_1$. We place the window on the first k rows of the matrix $A^{(0,1)}$ and perform the column elimination operation until echelon form is achieved.
The superscripts (., .) denote the number of the column elimination step in this iteration and the window number, respectively. According to Result 1, the formation of any all-zero column in the matrix $A^{(k,1)}$ means that the corresponding column of the transition matrix is a dual codeword. But because of noise (noise increases the rank of the matrix), it is very likely that there are not $n-k$ all-zero columns on the right-hand side of $A^{(k,1)}$, even when dual codewords appear in the transition matrix. In other words, for a noisy channel, the probability of an all-zero syndrome matrix s (the quantity $s = A H^T$ is called the syndrome matrix and it is known at the receiver) is very low.

Knowing that $\epsilon \ll \frac{1}{2}$ and from Result 1, if we can find k independent rows $v_j$ in the window such that $v_j h^T = 0$, $1 \le j \le k$ ($h^T$ is a column of $H^T$), then a low-weight column (for which a criterion will be calculated in Section V) will appear in $A^{(k,1)}$, and the corresponding column of the transition matrix $\Pi^{(1)}$ can be accepted (with some probability of false alarm) as a basis vector of the dual code. Therefore, for noisy data, we attempt to build low-weight columns in the matrix $A^{(k,1)}$.

Iteration 2: $A^{(0,2)} = A^{(k,1)}$. Now we slide the window down by k rows ($W_2$) and, as in the first iteration, by performing the column elimination operation on $A^{(0,2)}$ we reduce $W_2$ to echelon form.
Again we look for low-weight columns in $A^{(k,2)}$ and the corresponding columns in $\Pi^{(2)}$. The iterations continue until we obtain a parity check matrix.

Notice that, at each iteration, the matrix obtained in the previous iteration is used. We will show in the next theorem that not only do the invalid parity equations of an iteration not affect the next iteration, but the valid parity equations also pass to the next iteration. The iterative elimination algorithm is summarized in Table I.
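
The iterations described above can be sketched as follows. This is a minimal illustration under several assumptions: the names `eliminate_window`, `iterative_elimination` and `encode` are hypothetical, pivots are found by column swaps, all column operations are accumulated in a single transition matrix so that stored columns refer to the original data matrix, and the threshold `T` is simply passed in by the caller rather than computed as in Section V. The demonstration data are noiseless codewords of an assumed (7,4) Hamming code.

```python
import random


def eliminate_window(E, Pi, rows):
    """Column-eliminate E in place with pivots taken from `rows`; mirror each operation on Pi."""
    n = len(E[0])
    rank = 0
    for r in rows:
        c = next((c for c in range(rank, n) if E[r][c]), None)
        if c is None:
            continue
        for mat in (E, Pi):
            for row in mat:
                row[rank], row[c] = row[c], row[rank]
        for j in range(n):
            if j != rank and E[r][j]:
                for mat in (E, Pi):
                    for row in mat:
                        row[j] ^= row[rank]
        rank += 1
    return rank


def iterative_elimination(A, k, T, iterations):
    """Return the set of candidate dual codewords whose column weight falls below T."""
    M, n = len(A), len(A[0])
    E = [row[:] for row in A]
    Pi = [[int(i == j) for j in range(n)] for i in range(n)]   # accumulated transition matrix
    found = set()
    for i in range(min(iterations, M // k)):
        eliminate_window(E, Pi, range(i * k, (i + 1) * k))     # window W_{i+1}
        for j in range(k, n):
            z = sum(row[j] for row in E)                       # weight of column j over all M rows
            if z < T:
                found.add(tuple(Pi[m][j] for m in range(n)))
    return found


def encode(msg, G):
    """XOR together the rows of G selected by the message bits."""
    word = [0] * len(G[0])
    for bit, row in zip(msg, G):
        if bit:
            word = [a ^ b for a, b in zip(word, row)]
    return word


# Assumed (7,4) Hamming code; noiseless demonstration with M = 10k rows.
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
rng = random.Random(0)
A = [row[:] for row in G] + [encode([rng.randint(0, 1) for _ in range(4)], G) for _ in range(36)]
found = iterative_elimination(A, k=4, T=1, iterations=10)
```

In this noiseless setting every dual column has weight zero, so any `T` of at least 1 accepts it. For noisy data the threshold has to separate the two column-weight distributions of Section V, and accepted columns are only candidates.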

As shown in [6], the weight of $A \times h^T$ depends on the channel error; so we use the weights of the columns of $A^{(k,i)}$, $1 \le i \le I$, to detect the words of H. If a column of $\Pi^{(i)}$ belongs to the dual code, the weight of the corresponding column of $A^{(k,i)}$ will be much smaller than $\frac{M}{2}$, since $\epsilon \ll \frac{1}{2}$. This is the reason for searching for low-weight columns.

Theorem 2: In the iterative elimination algorithm, an error in window $W_{i-1}$ (the $(i-1)$th iteration) does not pass to the next iterations, and if any window consists of k independent valid codewords, then a basis of the dual code space will appear in the transition matrix $\Pi^{(i)}$.

Proof: The matrix A at the transmitter can be written as

$A^{(correct)} = (B \mid P^{(correct)})$, (28)

where B is a basis set for the vector space whose elements are the columns of the matrix $A^{(correct)}$, and $P^{(correct)}$ is the sub-matrix of dependent columns. We denote the columns of these two matrices by $b_j$, $1 \le j \le k$, and $p_j^{(correct)}$, $1 \le j \le n-k$, respectively. We have the following relation between the $b_j$'s and $p_j^{(correct)}$:

$p_j^{(correct)} = \sum_{m=1}^{k} \alpha_{(j,m)} b_m$, (29)

where $\alpha_{(j,m)} \in \{0, 1\}$: $\alpha_{(j,m)} = 1$ if $p_j^{(correct)}$ depends on $b_m$; otherwise $\alpha_{(j,m)} = 0$. Clearly, a permutation of rows does not change the $\alpha_{(j,m)}$'s. Therefore, the columns of the transformed matrix satisfy equation (29). Note that, in this case, the elements of the $b_j$'s and the $p_j^{(correct)}$'s are permuted.

Example 3: Assume that the matrix $A^{(correct)}$ (with rank 2) is as follows
Now, consider window $W_1$ in the noisy data matrix $A^{(0,1)}$ (iteration 1). Before performing the operations, for each column in windows $W_1$, $W_2$ and $W_i$ we can write a relation of the form of eq. (29) in which the term $\sum_{m=1}^{k} \beta^{i}_{(j,m)} b_m$, $\beta^{i}_{(j,m)} \in \{0, 1\}$, is added due to channel error. In the proof, the superscripts (., ., .) denote the number of the column elimination step, the iteration and the window, respectively. So (0, 1, i) means the ith window (the ith k rows) before performing the column elimination of the 1st iteration, and (k, 1, i) means the ith window after performing the column elimination of the 1st iteration. Remember that in the iterative elimination algorithm the operations are performed on all rows of the matrix, regardless of whether they are in the window or not.

After performing iteration 1 of the iterative elimination algorithm, consider window $W_1$ and, implicitly, windows $W_2$ and $W_i$. Corresponding relations can be written for the columns in each window, and we obtain the transition matrix $\Pi^{(1)}$. Similarly, in iteration 2, corresponding relations can be written for window $W_2$ and, implicitly, $W_i$ before and after the operation.
The resultant transition matrix of iteration 2 is

$$\Pi^{(2)} = \begin{pmatrix} I_k & \begin{matrix} \alpha_{(1,1)} + \beta^{2}_{(1,1)} & \cdots & \alpha_{(n-k,1)} + \beta^{2}_{(n-k,1)} \\ \vdots & & \vdots \\ \alpha_{(1,k)} + \beta^{2}_{(1,k)} & \cdots & \alpha_{(n-k,k)} + \beta^{2}_{(n-k,k)} \end{matrix} \\ 0 & I_{n-k} \end{pmatrix} \quad (46)$$


Thus, the error related to $W_1$ does not appear in the transition matrix $\Pi^{(2)}$. Considering iteration i and window $W_i$ in the same way, before and after the operations of iteration i, the columns on the right-hand side of the resulting transition matrix are a basis set for the dual code space.

To recover any word h of the parity check matrix, we need k independent codewords $v_j$, $1 \le j \le k$, at the ith iteration such that they satisfy $v_j h^T = 0$ for $1 \le j \le k$. Then $h^T$ appears in the transition matrix at the ith iteration, but because of the errors in the data matrix, the column weight $\sum_{m=1}^{M} a_{mj}$ is not zero.

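V. Computing the Threshold T

If h belongs to the dual code of C, the weight of $A \times h^T$ has a Gaussian distribution with mean $\frac{M}{2}\left(1 - (1-2\epsilon)^{wt_H(h)}\right)$ and variance $\frac{M}{4}\left(1 - (1-2\epsilon)^{2 wt_H(h)}\right)$; otherwise, the weight of $A \times h^T$ has a Gaussian distribution with mean $\frac{M}{2}$ and variance $\frac{M}{4}$ [6].

If $z_j$ is the weight of the jth column and $a^{(k,i)}_{m,j}$ is the (m, j) element of $A^{(k,i)}$, we have

$z_j = \sum_{m=1}^{M} a^{(k,i)}_{m,j}$.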
According to the parity elimination algorithm there are two alternatives:

$\mathcal{H}_0$: If, in the course of the algorithm, a dual codeword appears in the jth column of the transition matrix, $z_j$ has a Gaussian distribution with mean and variance

$m_{z_j} = \frac{M}{2}\left(1 - (1-2\epsilon)^{wt_H(h)}\right)$, (51)
$\sigma^2_{z_j} = \frac{M}{4}\left(1 - (1-2\epsilon)^{2 wt_H(h)}\right)$, (52)

respectively.

$\mathcal{H}_1$: Otherwise, $z_j$ has a Gaussian distribution with mean and variance

$m_{z_j} = \frac{M}{2}$, (53)
$\sigma^2_{z_j} = \frac{M}{4}$.
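
To get a feeling for how far apart the two column-weight distributions are, the means and variances above can be evaluated numerically. The sketch below is only an illustration; the function names are hypothetical and the values M = 320, $\epsilon$ = 0.01 and weight 4 are assumed example parameters.

```python
def dual_column_stats(M, eps, w):
    """Mean and variance of z_j when the column corresponds to a dual word of weight w."""
    q = (1.0 - 2.0 * eps) ** w
    return M / 2 * (1.0 - q), M / 4 * (1.0 - q * q)


def other_column_stats(M):
    """Mean and variance of z_j when the column does not correspond to a dual word."""
    return M / 2, M / 4


M, eps, w = 320, 0.01, 4                        # assumed example values
mean_dual, var_dual = dual_column_stats(M, eps, w)
mean_other, var_other = other_column_stats(M)
gap = mean_other - mean_dual                    # separation that the threshold T can exploit
```

As $\epsilon$ approaches 1/2 or the weight of the dual word grows, `mean_dual` approaches M/2 and the two cases become indistinguishable.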

False alarm probability ($p_{fa}$): the probability that $z_j$ is smaller than T although the corresponding column of the transition matrix does not belong to the parity check matrix.
Non-detection probability ($p_{nd}$): the probability that $z_j$ is greater than T although the corresponding column of the transition matrix belongs to the parity check matrix.
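
Under the Gaussian models stated earlier in this section, both probabilities can be evaluated for a given threshold with the complementary error function. The sketch below is an illustration under assumptions: the function names are hypothetical, the numerical values are example parameters, and $\epsilon > 0$ is required so that the variance of the dual case is non-zero.

```python
import math


def gaussian_cdf(x, mean, var):
    """P(X <= x) for a Gaussian variable with the given mean and variance."""
    return 0.5 * math.erfc(-(x - mean) / math.sqrt(2.0 * var))


def error_probabilities(T, M, eps, w):
    """Return (p_fa, p_nd) for threshold T under the Gaussian models (requires eps > 0)."""
    q = (1.0 - 2.0 * eps) ** w
    mean_dual, var_dual = M / 2 * (1.0 - q), M / 4 * (1.0 - q * q)
    p_fa = gaussian_cdf(T, M / 2, M / 4)                 # non-dual column falls below T
    p_nd = 1.0 - gaussian_cdf(T, mean_dual, var_dual)    # dual column exceeds T
    return p_fa, p_nd


# Assumed example values: M = 320 rows, eps = 0.01, dual word of weight 4, T = 60.
p_fa, p_nd = error_probabilities(60, 320, 0.01, 4)
```

Sweeping `T` between the two means shows the usual trade-off: lowering `T` reduces `p_fa` and increases `p_nd`.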
Block codes, and especially LDPC codes, have words of very low weight in their parity check matrices. Thus, M does not change significantly even as $\epsilon$ increases [6]. Therefore, we choose the greatest $wt_H(h)$.



VI. Practical Experiments

In the previous section we showed how to find words of the parity check matrix. Table II gives some results of running the parity elimination algorithm on random linear codes of rate 1/2.
In comparison to [5], the results obtained by the iterative elimination algorithm are better. The decrease in the number of recovered dual codewords as the code length increases is due to the increase in the error probability of each codeword.

A complexity of $O(n^3)$ and a memory requirement of order $O(n^2)$ make this algorithm practical. Furthermore, the structure of the algorithm allows it to be run on long codes by exploiting parallel processors. In addition, no a priori knowledge of the Hamming weight of the parity check words is needed.
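
One way to exploit the XOR-only structure in software is to store every column of the data matrix as a single integer, so that adding one column to another is one XOR and a column weight is a bit count. The sketch below illustrates this idea only; the bit-packed representation and the function names are assumptions, not part of the paper.

```python
def pack_columns(A):
    """Store each column of a 0/1 matrix as one integer (bit m holds row m)."""
    return [sum(row[j] << m for m, row in enumerate(A)) for j in range(len(A[0]))]


def column_weight(col):
    """Hamming weight of a packed column."""
    return bin(col).count("1")


A = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]
cols = pack_columns(A)      # one integer per column
cols[2] ^= cols[0]          # add column 0 to column 2 over GF(2) with a single XOR
weight = column_weight(cols[2])
```

With this layout, one elimination step over M rows costs one XOR per affected column, independent of how the rows are grouped into windows.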

VII. Conclusion

In this paper we introduced a very low complexity method to recover the parity check matrix of a binary linear block code. Because it uses only the XOR operator, the algorithm can be implemented easily. An important contribution of this algorithm is that (unlike [6]) there is no need to search over the Hamming weight of the dual codewords, i.e. the parity elimination algorithm can recover dual codewords of different weights simultaneously.

As mentioned, an increase in code length may result in a larger average number of bit errors per codeword, and therefore more iterations are required in the algorithm. However, there may still be enough valid codewords in the noisy data matrix, though not placed in the same window. Instead of searching for dual codewords, one can recover valid codewords. In the next paper, we will introduce a method to detect errors in the noisy data matrix.

TABLE I
Parity Elimination Algorithm

Choose M with eq. (59)
Compute T with eq. (60)
Choose the number of iterations I
for i = 1 : I
    $A^{(0,i+1)} = A^{(0,i)} \times \Pi^{(i+1)}$
    for j = k + 1 : n
        $z_j = \sum_{m=1}^{M} a^{(0,i+1)}_{m,j}$
        if $z_j < T$
            Store the jth column of $\Pi^{(i+1)}$
        end
    end
end



TABLE II

Average number of parity check words recovered in 10000 runs of the parity elimination algorithm on random codes of rate 1/2. n is the length of the codewords, $\epsilon$ is the crossover probability, M = 10k and I = 10.

n \ $\epsilon$    0.001    0.002    0.005    0.01    0.02    0.05
32                15.9     10.5     8.7      5.6     3.2     1.4
64                31.7     26.9     18.6     8.6     1.7     0.4
128               63.4     51.1     27.3     1.2     -       -
256               127.2    83.8     -        -       -       -




References

[1] Valembois A. Detection and recognition of a binary linear code. Discrete Applied Mathematics. 2001;111(1-2):199-218.
[2] Cluzeau M. Block code reconstruction using iterative decoding techniques. In: Information Theory, 2006 IEEE International Symposium on; 2006. p. 2269-2273.
[3] Cluzeau M, Tillich JP. On the code reverse engineering problem. In: Information Theory, 2008. ISIT 2008. IEEE International Symposium on; 2008. p. 634-638.
[4] Barbier J, Sicot G, Houcke S. Algebraic approach of the reconstruction of linear and convolutional error correcting codes. In: Applied Mathematics and Computer Sciences. 2006;2:113-118.
[5] Barbier J. Analyse de canaux de communication dans un contexte non-cooperatif. Paristech; 2007.
[6] Cluzeau M, Finiasz M. Recovering a code's length and synchronization from a noisy intercepted bitstream. In: Information Theory, 2009. ISIT 2009. IEEE International Symposium on; 2009. p. 2737-2741.
[7] Canteaut A, Chabaud F. A new algorithm for finding minimum-weight words in a linear code: application to McEliece's cryptosystem and to narrow-sense BCH codes of length 511. Information Theory, IEEE Transactions on. 1998 Jan;44(1):367-378.
[8] Morelos-Zaragoza RH. The Art of Error Correcting Coding. John Wiley & Sons; 2002.
[9] Chabot C. Recognition of a code in a noisy environment. In: Information Theory, 2007. ISIT 2007. IEEE International Symposium on; 2007. p. 2211-2215.



